Using Human Demonstrations to Improve Reinforcement Learning

نویسندگان

  • Matthew E. Taylor
  • Halit Bener Suay
  • Sonia Chernova
چکیده

This work introduces Human-Agent Transfer (HAT), an algorithm that combines transfer learning, learning from demonstration and reinforcement learning to achieve rapid learning and high performance in complex domains. Using experiments in a simulated robot soccer domain, we show that human demonstrations transferred into a baseline policy for an agent and refined using reinforcement learning significantly improve both learning time and policy performance. Our evaluation compares three algorithmic approaches to incorporating demonstration rule summaries into transfer learning, and studies the impact of demonstration quality and quantity. Our results show that all three transfer methods lead to statistically significant improvement in performance over learning without demonstration.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Integrating reinforcement learning with human demonstrations of varying ability

This work introduces Human-Agent Transfer (HAT), an algorithm that combines transfer learning, learning from demonstration and reinforcement learning to achieve rapid learning and high performance in complex domains. Using experiments in a simulated robot soccer domain, we show that human demonstrations transferred into a baseline policy for an agent and refined using reinforcement learning sig...

متن کامل

Integrating Human Demonstration and Reinforcement Learning: Initial Results in Human-Agent Transfer

This work introduces Human-Agent Transfer (HAT), a method that combines transfer learning, learning from demonstration and reinforcement learning to achieve rapid learning and high performance in complex domains. Using experiments in a simulated robot soccer domain, we show that human demonstrations can be transferred into a baseline policy for an agent, and reinforcement learning can be used t...

متن کامل

Inverse Reinforce Learning with Nonparametric Behavior Clustering

Inverse Reinforcement Learning (IRL) is the task of learning a single reward function given a Markov Decision Process (MDP) without defining the reward function, and a set of demonstrations generated by humans/experts. However, in practice, it may be unreasonable to assume that human behaviors can be explained by one reward function since they may be inherently inconsistent. Also, demonstration...

متن کامل

Toward Probabilistic Safety Bounds for Robot Learning from Demonstration

Learning from demonstration is a popular method for teaching robots new skills. However, little work has looked at how to measure safety in the context of learning from demonstrations. We discuss three different types of safety problems that are important for robot learning from human demonstrations: (1) using demonstrations to evaluate the safety of a robot’s current policy, (2) using demonstr...

متن کامل

Imitation and Reinforcement Learning from Failed Demonstrations

Current work in robotic imitation learning uses successful demonstrations of a task performed by a human teacher to initialize a robot controller. Given a reward function, this learned controller can then be improved using techniques derived from reinforcement learning. We instead use failed attempts, which may be more plentiful, to initialize our controller and, taking them as illustrations of...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2011